36        Bioinformatics

directory “preprocessing”, then download the FASTQ files from the NCBI SRA database,
delete the first read file, and finally rename the second read file to “bad.fastq” for practice
purposes. The script then generates the FastQC quality report and displays it in the Firefox browser.

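# Create a working directory and move into it
mkdir preprocessing
cd preprocessing
# Download the reads of run SRR957824 from the NCBI SRA database
fasterq-dump --verbose SRR957824
# Keep only the second read file and rename it for practice
rm SRR957824_1.fastq
mv SRR957824_2.fastq bad.fastq
# Run FastQC on the FASTQ file in the directory
fqfile=$(ls *.fastq)
fastqc "$fqfile"
# Open the resulting HTML report in Firefox
htmlfile=$(ls *.html)
firefox "$htmlfile"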
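The script only runs to completion if every program it calls is installed. As a minimal sketch, a small helper can confirm this before anything is downloaded. The function name require_tools is our own illustration and is not part of the SRA Toolkit or FastQC; it relies only on the standard shell built-in command -v.

```shell
# Hypothetical helper: report any required command that is not on the PATH.
# Returns 0 when every command is found and 1 otherwise.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "Missing required tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, placing the line require_tools fasterq-dump fastqc firefox || exit 1 at the top of the script would stop it early with a message naming whichever program is missing.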

When all the commands have been executed sequentially without error, the QC report
will be displayed in the Firefox browser. Study the report carefully and identify
any potential problems in the quality metrics that we discussed in the previous section.
Figure 1.29 shows that the reads in the file have three failures and a single warning.
Next, we will try to fix as many of these problems as possible.
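The failures and warnings can also be counted on the command line. FastQC writes a tab-separated summary.txt file inside its zipped output, with the status (PASS, WARN, or FAIL) in the first column and the module name in the second. The sketch below assumes that the output has been unpacked, for example by running FastQC with the --extract option. The function names summarize_fastqc and list_flagged_modules are our own and are not part of FastQC.

```shell
# Hypothetical helper: count PASS/WARN/FAIL entries in a FastQC summary.txt.
# Usage: summarize_fastqc path/to/summary.txt
summarize_fastqc() {
    # The first tab-separated column holds the status of each module.
    awk -F'\t' '
        { count[$1]++ }
        END {
            printf "PASS=%d WARN=%d FAIL=%d\n", count["PASS"], count["WARN"], count["FAIL"]
        }
    ' "$1"
}

# Hypothetical helper: list only the modules that did not pass.
# Usage: list_flagged_modules path/to/summary.txt
list_flagged_modules() {
    awk -F'\t' '$1 != "PASS" { print $1 ": " $2 }' "$1"
}
```

Assuming the report was extracted into a directory named bad_fastqc, calling summarize_fastqc bad_fastqc/summary.txt prints the three counts on one line, and list_flagged_modules bad_fastqc/summary.txt names the modules that need attention.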

Using such a FASTQ file in downstream analysis without fixing its quality
problems will affect the results negatively and may lead to misleading conclusions.
A good strategy whenever there are warnings or failures is to try the available ways of fixing
the problems; if a problem cannot be fixed, you need to be aware
of it and to understand how it may affect the results.

FIGURE 1.29  The QC report summary and per base sequence quality for the “bad.fastq” file.